Goal:

Think, explore, & write about what the co-evolutionary interaction between newts & snakes with different genetic architectures (GAs, a combination of mutation rate & mutation effect size) can lead to. This markdown investigates what is up with the different levels of correlation between rectangles and squares (in connection with GA1 tall). After fixing the row vs column error I looked at the correlation data and found that there was less correlation. So I decided to investigate why that might be and run a few more experiments. I am running an experiment to test how changing the square size might impact the calculations. I also plan on changing the interaction rate (but first want to look at the math/feasibility of it). This file contains results discussed in Tall_GA1!

Questions:

What does it mean to be correlated?

Background

I investigated the correlations between snake and newt phenotypes at the local grid locations along with the population level phenotype. I found there was a lot of correlation when populations slowly adapt! However, after a closer investigation I found that these correlations came from rectangle-shaped areas and not square ones. Here, I rerun the experiment with the correct square grids and discuss how the results differ. I also aim to investigate how changing the area/shape of the grids will influence correlation results. I am also thinking of ways to gather and incorporate allele information.

Experiment

I created a simulation study to observe the co-evolutionary outcome of the newt-snake interaction with different genetic architectures (GAs) in a spatial setting. I hypothesized that we would see an interaction (co-evolutionary arms race) between newt and snake phenotype under some GA combinations when newts and snakes were evolving over geographical space. Each of the 4 GAs for one species is paired with each of the 4 GAs for the other species, creating 16 combinations.
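A minimal Python sketch of how the 16 combinations can be enumerated. The GA labels are copied from the result tables in this file; the `mutational_variance_proxy` function is my own assumption (mutation rate times effect size squared) and is only meant to show the ordering of the GAs, not the value used inside the simulations. This is not the analysis code behind this file.

```python
from itertools import product

# GA labels as they appear in the result tables: "<mutation rate>_<effect size>".
GAS = [("1e-08", "0.005"), ("1e-09", "0.05"), ("1e-10", "0.5"), ("1e-11", "5")]


def ga_label(ga):
    """Join rate and effect size the way the tables label a single GA."""
    rate, effect = ga
    return f"{rate}_{effect}"


def mutational_variance_proxy(ga):
    """Assumed proxy (hypothetical): variance input scales as rate * effect_size**2."""
    rate, effect = ga
    return float(rate) * float(effect) ** 2


# Every GA paired with every GA (including itself) gives 4 * 4 = 16 combinations.
combos = [f"{ga_label(a)}_{ga_label(b)}" for a, b in product(GAS, GAS)]

for ga in GAS:
    print(ga_label(ga), mutational_variance_proxy(ga))
print(len(combos), "combinations")
```

Under this proxy the value grows tenfold with each step from 1e-08_0.005 to 1e-11_5, which matches the description of 1e-08_0.005 as the GA with the lowest mutational variance.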

GA1 experiment values:

Landscape: 20 by 5 grid on a tall map (354 H, 35 W).

New landscape: 28 by 7 grid on the same map size (354 H, 35 W).

Each GA combination and trial has its own msprime simulation, but the msprime file is shared between the 5 section and 7 section SLiM runs.

The data

I gather data from different text files and have to do some table wrangling to get it into a format I can graph. I use three kinds of data: a mean value for the entire map (lit), correlation data based on local populations that were divided into grids (cor), and data collected from grid based populations (grid). These simulations have an msprime stage (coalescent/burn in/random genetic variation) and then run for 50,000 generations in SLiM (co-evolution), all with the same GA values. I run 4 trials for every GA combination and try two different grid sizes, 20 by 5 (large) and 24 by 7 (small), in SLiM. These SLiM simulations use the same msprime simulation, but their results can still differ statistically.

The two data sets in this experiment have the same msprime burn-in (16 msprime trees), but have different SLiM results. For the most part the additional SLiM simulations (the ones with a smaller grid) can be used as additional trials. They will only differ in the way correlation and grid values are calculated. The cor and grid files are affected. The lit file is not affected.

## All cor, lit, and grid files exist!
## This program will now end!

Mean Phenotype Whole Simulation

First, I will look at a plot of how the mean phenotype of the entire population of newts and snakes changes over time. Each of these plots has three colored lines: red for the mean newt phenotype, blue for the mean snake phenotype, and black for the difference between mean snake and mean newt phenotype. There are 4 lines per color that represent the different replicates that I ran. Note this is a mean for the entire population. Mutational variance increases for the snakes as you go down the figure (top to bottom) and increases for the newts as you go across (left to right). The table shows the mean difference between snake and newt phenotype for the four trials. There are two sets of data, 5 sections and 7 sections. Since we are looking at whole population means, the data collection for this section does not differ between the two sets.

Phenotype differences (5 sec)

Table of average Differences (5 sec)

##                    Group.1          x
## 1  1e-08_0.005_1e-08_0.005  0.1228741
## 2   1e-08_0.005_1e-09_0.05 -1.9048949
## 3    1e-08_0.005_1e-10_0.5 -2.8179176
## 4      1e-08_0.005_1e-11_5 -0.4003216
## 5   1e-09_0.05_1e-08_0.005  2.7018939
## 6    1e-09_0.05_1e-09_0.05 -0.5188002
## 7     1e-09_0.05_1e-10_0.5 -1.1922474
## 8       1e-09_0.05_1e-11_5  0.4028800
## 9    1e-10_0.5_1e-08_0.005  1.8949091
## 10    1e-10_0.5_1e-09_0.05 -0.2873013
## 11     1e-10_0.5_1e-10_0.5 -0.4165329
## 12       1e-10_0.5_1e-11_5  0.3512759
## 13     1e-11_5_1e-08_0.005  0.1343393
## 14      1e-11_5_1e-09_0.05 -1.2582723
## 15       1e-11_5_1e-10_0.5 -1.3798476
## 16         1e-11_5_1e-11_5 -0.6589529

Phenotype differences (7 sec)

Table of average Differences (7 sec)

##                    Group.1           x
## 1  1e-08_0.005_1e-08_0.005  0.13112840
## 2   1e-08_0.005_1e-09_0.05 -2.39679271
## 3    1e-08_0.005_1e-10_0.5 -2.73768918
## 4      1e-08_0.005_1e-11_5 -0.25055622
## 5   1e-09_0.05_1e-08_0.005  2.61270891
## 6    1e-09_0.05_1e-09_0.05 -0.70695433
## 7     1e-09_0.05_1e-10_0.5 -1.14717140
## 8       1e-09_0.05_1e-11_5  0.38920777
## 9    1e-10_0.5_1e-08_0.005  2.10180688
## 10    1e-10_0.5_1e-09_0.05 -0.08745614
## 11     1e-10_0.5_1e-10_0.5 -0.36972292
## 12       1e-10_0.5_1e-11_5  0.43034445
## 13     1e-11_5_1e-08_0.005  0.56986732
## 14      1e-11_5_1e-09_0.05 -1.26474530
## 15       1e-11_5_1e-10_0.5 -1.46086858
## 16         1e-11_5_1e-11_5 -0.82308506

The mean phenotype of newts and snakes typically goes up as the number of generations increases. When there is a large gap in mutational variance (B, C, E, I) there tends to be some phenotype flat lining. Sometimes newt and snake mean phenotypes are close, while at other times they are further apart. The average difference between snake and newt mean phenotypes ranges from about -2.8 to 2.7 (which is very similar between the two simulation sets).
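The average differences tabulated above are a grouped mean of (snake mean phenotype minus newt mean phenotype) per GA combination. A minimal Python sketch of that calculation follows; the record layout and the numbers are made up for illustration and are not simulation output.

```python
from collections import defaultdict


def mean_difference_by_combo(records):
    """Average (snake - newt) mean phenotype for each GA combination.

    records: iterable of (combo_label, snake_mean_pheno, newt_mean_pheno),
    one entry per trial and recorded generation (hypothetical layout).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for combo, snake, newt in records:
        sums[combo] += snake - newt
        counts[combo] += 1
    return {combo: sums[combo] / counts[combo] for combo in sums}


# Made-up values, only to show the shape of the input and output.
records = [
    ("1e-08_0.005_1e-08_0.005", 1.2, 1.0),
    ("1e-08_0.005_1e-08_0.005", 1.4, 1.0),
    ("1e-09_0.05_1e-08_0.005", 4.0, 1.0),
]
table = mean_difference_by_combo(records)
for combo, diff in table.items():
    print(combo, round(diff, 4))
```

A positive value means snakes have the higher mean phenotype and a negative value means newts do.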

Connection between higher phenotype and population (5 sec)

In previous experiments like GA1 tall map, I saw a connection between newt/snake phenotype and population size. Typically, when a species had a higher phenotype they also had a larger population size. This relation between phenotype and population size had specific outcomes that depended on the GA of newts and snakes.

This section contains two figures for both data sets. The first figure compares the population size of newts and snakes to the difference between mean snake and mean newt phenotype for a time slice (5,000-10,000 generations). Color in this plot is the difference between snake and newt phenotype, with blue indicating snakes have a larger phenotype and red indicating newts have a larger phenotype. Cream colored points indicate that the two phenotypes are nearly the same. The second figure presents histograms of the difference between snake and newt population size (green) and phenotype (purple) for the same time slice (5,000-10,000 generations). Since the data in this section was calculated for the whole population, the 5 section and 7 section results can be considered the same.

Phenotype differences (5 sec)

Phenotype & Populationsize differences (5 sec)

Phenotype differences (7 sec)

Phenotype & Populationsize differences (7 sec)

I looked at the results between 5,000 and 10,000 generations, where I saw most of the increase in newt and snake phenotype. The first plot has newt population size by snake population size with the color representing the difference between snake and newt mean phenotype (red=newts have a higher mean phenotype, blue=snakes have a higher mean phenotype). The third GA (an intermediate one) seems to do better than all of the other GAs (blue row (3), red column (3)). However, this is dependent on the point in the simulation you focus on. The next two plots show the difference between snake and newt population size (green) and phenotype (purple). For the most part these suggest that when the mean phenotype is higher so is the population size. This also shows the asymmetrical nature of the predator-prey interaction (higher max population sizes seen in snakes). The population size is also more variable than the phenotype (the difference in population size has a wider range). There are minute differences between the 5 section and 7 section figures, but for the most part they are very similar to each other (similar at the GA combination level).

Correlation

In the next section I look at how correlated newts and snakes are in local populations across the geographical area for both the 5 and 7 sections. I am examining the correlation between newt and snake phenotypes, and I predicted that there would be a positive correlation between the phenotypes. I first look at the correlation between mean newt phenotype and mean snake phenotype for each of the four trials in every GA combination from 10,000-15,000 generations. The solid line is at 0 and the dashed line is at the level of correlation seen in natural newt-snake population(s). I try to compare the 5 sections to the 7 sections to see if there is a significant difference between the two. I will also compare these results to the rectangular correlation values.
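A minimal Python sketch of a grid-based spatial correlation, under my own assumptions: individuals are given as (x, y, phenotype) tuples, the map is 35 wide by 354 tall, "20 by 5" means 20 rows by 5 columns, and only cells that contain both species enter the correlation. The function names are hypothetical and this is not the code that produced the cor files.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation; returns NaN if either input has no variation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def cell_means(individuals, width, height, n_cols, n_rows):
    """Mean phenotype per grid cell, keyed by (row, col)."""
    cells = defaultdict(list)
    for x, y, pheno in individuals:
        # Clamp so that individuals exactly on the far border stay in the last cell.
        col = min(int(x / width * n_cols), n_cols - 1)
        row = min(int(y / height * n_rows), n_rows - 1)
        cells[(row, col)].append(pheno)
    return {cell: sum(v) / len(v) for cell, v in cells.items()}


def spatial_correlation(newts, snakes, width, height, n_cols, n_rows):
    """Correlate newt and snake cell means across cells holding both species."""
    newt_means = cell_means(newts, width, height, n_cols, n_rows)
    snake_means = cell_means(snakes, width, height, n_cols, n_rows)
    shared = sorted(set(newt_means) & set(snake_means))
    if len(shared) < 2:
        return float("nan")
    return pearson([newt_means[c] for c in shared],
                   [snake_means[c] for c in shared])


# Toy data: both phenotypes increase with y, so the cell means line up perfectly.
newts = [(17, y, y / 100) for y in (10, 100, 200, 300)]
snakes = [(17, y, 2 * y / 100 + 1) for y in (10, 100, 200, 300)]
r = spatial_correlation(newts, snakes, width=35, height=354, n_cols=5, n_rows=20)
print(r)
```

Calling `spatial_correlation` on the same snapshot with different `n_cols` and `n_rows` values is one way to compare grid settings without rerunning the simulation.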

Five sections

Seven sections

After looking at this figure, I can see that there is a range of correlation values. Most spatial phenotype correlation values are at or just above 0. There are a few that are very positive or negative. I do not see a huge difference between having 5 or 7 sections. It would have been better to have done one simulation and output both sets of values (I might rerun this experiment and have more grid values to test, but they would all be in one simulation). The spatial correlation results seem lower than the ones in GA1 tall (with rectangular grid slices). I tested the correlation calculation of squares vs rectangles with Peter in points.RMD. We used the data points of all individuals to see if there was a significant difference between squares and rectangles, but found none. It is possible that more or fewer grid cells would increase or decrease spatial phenotype correlation, but there might not be a perfect value that gives the most accurate understanding of what is going on between newt and snake phenotypes at the local level.

Correlation Histograms (5 sec)

In order to understand how spatial correlations were changing with time, I took 5,000 generation time slices and looked at the correlation values of all four trials. Each color is a different trial per GA combination. The histogram values are stacked. This section only looks at the 5 sections.
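A minimal Python sketch of the time slicing, assuming the correlation data can be read as (trial, generation, correlation) rows; that layout and the function names are hypothetical. With 50,000 generations and 5,000 generation slices this gives 10 slices, one per plot.

```python
from collections import defaultdict

SLICE_LENGTH = 5000  # generations per time slice


def slice_bounds(generation, slice_length=SLICE_LENGTH):
    """Return the (start, end) generations of the slice holding `generation`."""
    start = (generation // slice_length) * slice_length
    return (start, start + slice_length)


def group_by_slice(rows):
    """Group correlation values as {slice_bounds: {trial: [correlations]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for trial, generation, cor in rows:
        grouped[slice_bounds(generation)][trial].append(cor)
    return grouped


# Made-up rows, only to show the grouping.
rows = [(0, 1000, 0.10), (0, 4999, 0.20), (1, 5000, -0.30), (1, 12345, 0.05)]
grouped = group_by_slice(rows)
for bounds in sorted(grouped):
    print(bounds, dict(grouped[bounds]))
```

Each inner list can then be drawn as one color in a stacked histogram for that slice.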

Plot 1

Plot 2

Plot 3

Plot 4

Plot 5

Plot 6

Plot 7

Plot 8

Plot 9

Plot 10

The spatial correlations move around a lot. None of them line up with the real newt and snake spatial phenotype correlation value of ~0.7. (More discussion after the 7 section correlation histograms.)

Correlation Histograms (7 sec)

In order to understand how spatial correlations were changing with time, I took 5,000 generation time slices and looked at the correlation values of all four trials. Each color is a different trial per GA combination. The histogram values are stacked. These are the 7 section results.

Plot 1 (7 sec)

Plot 2 (7 sec)

Plot 3 (7 sec)

Plot 4 (7 sec)

Plot 5 (7 sec)

Plot 6 (7 sec)

Plot 7 (7 sec)

Plot 8 (7 sec)

Plot 9 (7 sec)

Plot 10 (7 sec)

After examining these sets of data, I cannot see a noticeable difference in newt-snake phenotype spatial correlation. I would like to run another experiment that collects correlation data for many different grid settings. I would also like to see how increasing the size of the map would affect these results. I wonder if the areas where newts and snakes are co-evolving overlap a grid line. How can we tell what is occurring at the individual level? For example, how specific (beneficial) mutations are moving within populations, or how individuals might be moving into one area because of a lack of other individuals. Essentially, what is occurring in my simulation?

Correlation across time

Next, we will examine three randomly chosen plots from both the 5 section and 7 section experiments. Time (in generations) is on the x-axis, and both mean phenotype and phenotype spatial correlation are on the y-axis. Newt whole population mean phenotype is red, while snake mean phenotype is blue. The pink line is the phenotype spatial correlation.

Random 1 (5 sec)

## [1] "pattern 1e-08_0.005_1e-08_0.005_2"
## [1] "Cor between average snake pheno and local cor 0.571192362910474"
## [1] "Cor between average newt pheno and local cor 0.664408601747308"
## [1] "Cor between average dif pheno and local cor -0.615771578072566"
## [1] "Cor between newt pheno and snake 0.822111813013718"

Random 2 (5 sec)

## [1] "pattern 1e-10_0.5_1e-09_0.05_0"
## [1] "Cor between average snake pheno and local cor -0.372655847213216"
## [1] "Cor between average newt pheno and local cor -0.387832763343737"
## [1] "Cor between average dif pheno and local cor 0.346696754771787"
## [1] "Cor between newt pheno and snake 0.912165028376019"

Random 3 (5 sec)

## [1] "pattern 1e-11_5_1e-08_0.005_1"
## [1] "Cor between average snake pheno and local cor 0.479310905820666"
## [1] "Cor between average newt pheno and local cor 0.170565894728146"
## [1] "Cor between average dif pheno and local cor 0.479388521800722"
## [1] "Cor between newt pheno and snake 0.386038678485869"

Random 1 (7 sec)

## [1] "pattern 1e-11_5_1e-08_0.005_3"
## [1] "Cor between average snake pheno and local cor 0.084251337014039"
## [1] "Cor between average newt pheno and local cor 0.421664348573081"
## [1] "Cor between average dif pheno and local cor 0.012405567960854"
## [1] "Cor between newt pheno and snake 0.430100767116733"

Random 2 (7 sec)

## [1] "pattern 1e-09_0.05_1e-10_0.5_2"
## [1] "Cor between average snake pheno and local cor 0.00139288166942614"
## [1] "Cor between average newt pheno and local cor 0.00274228758658211"
## [1] "Cor between average dif pheno and local cor -0.00141610724143437"
## [1] "Cor between newt pheno and snake 0.904531077811197"

Random 3 (7 sec)

## [1] "pattern 1e-09_0.05_1e-08_0.005_2"
## [1] "Cor between average snake pheno and local cor -0.00199389233788209"
## [1] "Cor between average newt pheno and local cor -0.186763422331012"
## [1] "Cor between average dif pheno and local cor 0.0640429634563006"
## [1] "Cor between newt pheno and snake 0.722208495036107"

All six of these plots show mean newt and snake phenotype increasing (very few show no change). Sometimes the mean newt and snake phenotypes are far apart, but often they are close together. The size of the jumps in mean phenotype (small or large) is determined by the GA of the species (higher mutation effect size = larger jumps). These plots also show that the phenotype spatial correlation fluctuates between being positive and being negative. I wonder if there is a connection between the difference in newt and snake mean phenotype and the local spatial correlation.
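One way to probe that question is to correlate the time series themselves, as in the "Cor between ..." lines printed above. A minimal Python sketch, assuming three equal-length series sampled at the same generations (snake mean phenotype, newt mean phenotype, local spatial correlation); the function names and toy numbers are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns NaN if either input has no variation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def time_series_correlations(snake, newt, local_cor):
    """Correlate whole-population series with the local spatial correlation."""
    dif = [s - n for s, n in zip(snake, newt)]  # snake minus newt
    return {
        "snake_vs_local": pearson(snake, local_cor),
        "newt_vs_local": pearson(newt, local_cor),
        "dif_vs_local": pearson(dif, local_cor),
        "newt_vs_snake": pearson(newt, snake),
    }


# Made-up series, only to show the call.
snake = [1.0, 2.0, 3.0, 4.0]
newt = [1.0, 1.5, 2.5, 3.0]
local_cor = [0.1, 0.2, 0.3, 0.4]
result = time_series_correlations(snake, newt, local_cor)
for name, value in result.items():
    print(name, round(value, 3))
```

Because both mean phenotypes trend upward over the run, a high newt-snake value here may mostly reflect the shared trend rather than local coupling.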

What happens over time (looking at the beginning, middle, and late part of my simulations)

This next section gives a glimpse of how newt & snake phenotype and population size differ over time. The populations start off with about 250 individuals each. Each individual has a different genetic background created by msprime. Then each msprime simulation is put into SLiM and data is generated. Plots show newt by snake population size, with the point color representing the difference between mean snake and newt phenotype (red=newts have a higher phenotype and blue=snakes have a higher phenotype). The other plots show histograms of the difference between snake and newt phenotype and population size (purple and green). First I will look at 5 sections, then I will look at 7 sections. This is whole population data, so there is no difference between the 5 and 7 section measurements.

Pheno Beginning

Pheno Middle

Pheno End

Dif Beginning

Dif Middle

Dif End

Pheno Beginning (7 sec)

Pheno Middle (7 sec)

Pheno End (7 sec)

Dif Beginning (7 sec)

Dif Middle (7 sec)

Dif End (7 sec)

The results looked so similar between the 5 sections and 7 sections that I had to double check my code! (This is probably because they had the same starting conditions.) In the beginning of the simulation both newt and snake populations grow. The difference in phenotype quickly becomes polarized. The population size reaches a steady point and then newts and snakes co-evolve. In the middle part of my simulation, the difference between newt and snake mean phenotype solidifies. Often, the difference in mean phenotype decreases (when compared to the beginning of the simulation). When the GA has a high mutation rate and low mutation effect size (GA 1), the difference in mean phenotype grows. This leads to the species with GA 1 losing the co-evolutionary arms race. More frequent, smaller steps do not help a species win an arms race (this might also be due to lower mutational variance). The histograms reflect what is seen in the scatter plots.

Summary

In the summary section, I try to come up with a way to show how different GA combinations can change the simulation results. In all of these plots snake GA is represented by color and newt GA is represented by shape. There are 16 color-shape combinations (with 4 repeats for trials). There are four sets of plots: 1) newt by snake population size, 2) phenotype difference by snake population size, 3) phenotype difference by snake GA, and 4) phenotype difference by newt GA. There are three figures in each set, taken at the beginning, middle, and end time chunks. These are whole population calculations, so the 5 section and 7 section data sets are not calculated differently.

Early-Sim Population Size Summary

Mid-Sim Population Size Summary

Late-Sim Population Size Summary

Early Difference Summary

Mid Difference Summary

Late Difference Summary

By Snake GA (Early)

By Snake GA (Mid)

By Snake GA (Late)

By Newt GA (Early)

By Newt GA (Mid)

By Newt GA (Late)

Early-Sim Population Size Summary (7 sec)

Mid-Sim Population Size Summary (7 sec)

Late-Sim Population Size Summary (7 sec)

Early Difference Summary (7 sec)

Mid Difference Summary (7 sec)

Late Difference Summary (7 sec)

By Snake GA (Early) (7 sec)

By Snake GA (Mid) (7 sec)

By Snake GA (Late) (7 sec)

By Newt GA (Early) (7 sec)

By Newt GA (Mid) (7 sec)

By Newt GA (Late) (7 sec)

There is a whole lot going on in these summary plots! Firstly, the 5 section and 7 section results are very similar (not unexpected because there is no calculation difference). In the first plot (newt population size by snake population size), all the points (shapes and colors) are clustered. There is some color separation (blue/green at the top left and purple/red at the bottom right). There is also a little bit of shape separation (circles at the top left and squares at the bottom right). As time goes on GA patterns emerge. There are lines of color and lines of shapes. The best GA for newts and snakes was 1e-09_0.05 (seen as green near the top of all the lines of shapes, and as triangles near the bottom right of all the lines of color). In the phenotype difference by snake population size figure there is some color separation (blue/green at the top right and purple/red at the bottom left) and some shape separation (circles at the top right and squares at the bottom left). Over time the points spread out and more green circles end up in the top right and red triangles/squares end up in the bottom left. When putting these figures together it seems like the population size of snakes is lower when the newt phenotype is larger than the snake phenotype.

After looking at the By GA plots, I can see that there is both shape and color separation. Intermediate GAs (1e-09_0.05 & 1e-10_0.5) do better in a co-evolutionary arms race. The GA with the lowest mutational variance (1e-08_0.005) did the worst. The benefit of a particular GA was not the same between species. For example, when newts had a GA of 1e-10_0.5 the mean phenotype difference ranged from -3 to 0, indicating that newts had a higher phenotype than, or were equal to, snakes no matter what GA the snakes had. But when snakes had a GA of 1e-10_0.5 the mean phenotype difference ranged from -1.5 to 2.2, indicating that snakes had a higher phenotype only when newts had certain GAs (1e-08_0.005, 1e-11_5.0).

What I learned in this section:

Heatmap

In the heatmap plots each GA combination and trial is presented by putting newt GA on the x-axis and snake GA plus trial number on the y-axis. The result is the color of that cell. There are two types of heatmap plots shown below. One shows the average snake population size for a time chunk, with darker colors indicating a smaller snake population and lighter colors indicating a larger snake population. The other heatmap shows the average difference between snake and newt mean phenotype (red=newts had a higher phenotype, blue=snakes had a higher phenotype). I look at 3 time slices for both types of heatmaps and for the 5 and 7 section results (measurements were the same).
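A minimal Python sketch of the heatmap layout described above: one column per newt GA and one row per snake GA and trial. The input dictionary keyed by (newt GA, snake GA, trial) is a hypothetical layout, and trials are assumed to be numbered 0 to 3 as in the pattern names printed earlier in this file.

```python
GAS = ["1e-08_0.005", "1e-09_0.05", "1e-10_0.5", "1e-11_5"]
TRIALS = [0, 1, 2, 3]


def heatmap_matrix(values):
    """Arrange per-run values into rows (snake GA, trial) by columns (newt GA).

    values: {(newt_ga, snake_ga, trial): number}; missing runs become None.
    Returns (row_labels, col_labels, matrix).
    """
    row_labels = [f"{snake}_{trial}" for snake in GAS for trial in TRIALS]
    matrix = [
        [values.get((newt, snake, trial)) for newt in GAS]
        for snake in GAS
        for trial in TRIALS
    ]
    return row_labels, list(GAS), matrix


# Made-up value for a single run, only to show where it lands.
values = {("1e-09_0.05", "1e-10_0.5", 2): 312.5}
row_labels, col_labels, matrix = heatmap_matrix(values)
row = row_labels.index("1e-10_0.5_2")
col = col_labels.index("1e-09_0.05")
print(len(matrix), "rows x", len(matrix[0]), "columns; cell =", matrix[row][col])
```

The same matrix shape works for both heatmap types; only the value stored per run changes (average snake population size or average phenotype difference).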

Population Size (Early)

Population Size (Mid)

Population Size (Late)

Phenotype (Early)

Phenotype (Mid)

Phenotype (Late)

Heatmap (7 sec)

Population Size (Early) (7 sec)

Population Size (Mid) (7 sec)

Population Size (Late) (7 sec)

Phenotype (Early) (7 sec)

Phenotype (Mid) (7 sec)

Phenotype (Late) (7 sec)

Heatmap results are very similar between the 5 and 7 sections. At the beginning of the simulation, snake population size was large for all snake GAs and in 3 out of the 4 newt GAs. The third GA (1e-10_0.5) had fewer snakes and more newts. The snake population was even smaller when snakes had GA 4 (1e-11_5.0). As the number of generations increased to the mid section of my simulation (the end section was the same), snake population size was large for all snake GAs, but only in 2 out of the 4 newt GAs. The second and third GAs (1e-09_0.05 & 1e-10_0.5) had fewer snakes and more newts. When looking at each column of newt GA, snake GA 2 (1e-09_0.05) consistently had a larger snake population size. Conversely, when looking at the rows of snake GA, newt GA 2 (1e-09_0.05) consistently had a smaller snake population size. The same pattern is seen in the phenotype difference heatmap. First GA 3 (1e-10_0.5) has a phenotype advantage, then as the number of generations increases GA 2 (1e-09_0.05) has an advantage. Something I found interesting is that on average newts have higher levels of toxicity than snakes have of resistance. It seems strange that an intermediate level of mutational variance has a better outcome in a co-evolutionary arms race.

What is up with the correlations

This section goes over the results from the grid calculations, which are different for the 5 and 7 sections. I divided my map up into smaller areas (grids) and calculated mean phenotype, SD phenotype, max phenotype, min phenotype, and population size. In each of these plots newts are represented by circles and snakes are represented by squares. Parameter values increase from a dark color to a lighter color (green-blue themed for phenotype, orange-pink themed for population size). There is also a subplot that plots each parameter (mean, SD…) of newts by snakes, colored by map location (red=corner, green=edge, blue=middle). I look at the same simulation at one time point in the beginning, middle, and end.
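A minimal Python sketch of the per-cell summaries and the corner/edge/middle labels, assuming cells are indexed by (row, col) on an `n_rows` by `n_cols` grid and that SD means the sample standard deviation (as R's `sd` computes). The function names are hypothetical.

```python
import statistics


def cell_location(row, col, n_rows, n_cols):
    """Classify a grid cell as 'corner', 'edge', or 'middle' of the map."""
    on_row_border = row in (0, n_rows - 1)
    on_col_border = col in (0, n_cols - 1)
    if on_row_border and on_col_border:
        return "corner"
    if on_row_border or on_col_border:
        return "edge"
    return "middle"


def cell_summary(phenotypes):
    """Mean, SD, max, min, and population size for one species in one cell."""
    n = len(phenotypes)
    return {
        "mean": statistics.fmean(phenotypes),
        # The sample SD needs at least two individuals.
        "sd": statistics.stdev(phenotypes) if n > 1 else float("nan"),
        "max": max(phenotypes),
        "min": min(phenotypes),
        "size": n,
    }


# Made-up phenotypes for one cell of a 20 by 5 grid.
summary = cell_summary([1.0, 2.0, 3.0])
print(cell_location(0, 0, 20, 5), summary)
```

Computing `cell_summary` for newts and snakes in every shared cell and correlating one statistic at a time across cells gives values of the kind listed below.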

Early Simulation Correlation

Mean

## [1] 0.4282603

SD

## [1] -0.05794028

Max Phenotype

## [1] 0.4025511

Min Phenotype

## [1] 0.242993

Population Size

## [1] -0.2920976

Middle Simulation Correlation

Mean

## [1] -0.0140173

SD

## [1] 0.2180298

Max Phenotype

## [1] 0.1438674

Min Phenotype

## [1] 0.01899735

Population Size

## [1] 0.3874851

Late Simulation Correlation

Mean

## [1] 0.30277

SD

## [1] 0.2940706

Max Phenotype

## [1] 0.3301998

Min Phenotype

## [1] 0.3831788

Population Size

## [1] 0.5742363

What is up with the correlations (7 sec)

Early Simulation Correlation (7 sec)

Mean (7 sec)

## [1] 0.3751063

SD (7 sec)

## [1] -0.1530689

Max Phenotype (7 sec)

## [1] 0.2984811

Min Phenotype (7 sec)

## [1] 0.2161704

Population Size (7 sec)

## [1] -0.2845804

Middle Simulation Correlation (7 sec)

Mean (7 sec)

## [1] 0.06761699

SD (7 sec)

## [1] 0.07612556

Max Phenotype (7 sec)

## [1] 0.0778987

Min Phenotype (7 sec)

## [1] 0.08233218

Population Size (7 sec)

## [1] 0.1521863

Late Simulation Correlation (7 sec)

Mean (7 sec)

## [1] 0.1479878

SD (7 sec)

## [1] 0.05354803

Max Phenotype (7 sec)

## [1] 0.1259599

Min Phenotype (7 sec)

## [1] 0.2548507

Population Size (7 sec)

## [1] 0.2318031